Training Deep Networks with Structured Layers by Matrix Backpropagation

نویسندگان

  • Catalin Ionescu
  • Orestis Vantzos
  • Cristian Sminchisescu
چکیده

Deep neural network architectures have recently produced excellent results in a variety of areas in artificial intelligence and visual recognition, well surpassing traditional shallow architectures trained using hand-designed features. The power of deep networks stems both from their ability to perform local computations followed by pointwise non-linearities over increasingly larger receptive fields, and from the simplicity and scalability of the gradient-descent training procedure based on backpropagation. An open problem is the inclusion of layers that perform global, structured matrix computations like segmentation (e.g. normalized cuts) or higher-order pooling (e.g. log-tangent space metrics defined over the manifold of symmetric positive definite matrices) while preserving the validity and efficiency of an end-to-end deep training framework. In this paper we propose a sound mathematical apparatus to formally integrate global structured computation into deep computation architectures. At the heart of our methodology is the development of the theory and practice of backpropagation that generalizes to the calculus of adjoint matrix variations. The proposed matrix backpropagation methodology applies broadly to a variety of problems in machine learning or computational perception. Here we illustrate it by performing visual segmentation experiments using the BSDS and MSCOCO benchmarks, where we show that deep networks relying on second-order pooling and normalized cuts layers, trained end-to-end using matrix backpropagation, outperform counterparts that do not take advantage of such global layers.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

NRL Memorandum Report: Gradually DropIn Layers to Train Very Deep Neural Networks: Theory and Implementation

We introduce the concept of dynamically growing a neural network during training. In particular, an untrainable deep network starts as a trainable shallow network and newly added layers are slowly, organically added during training, thereby increasing the network’s depth. This is accomplished by a new layer, which we call DropIn. The DropIn layer starts by passing the output from a previous lay...

متن کامل

Generalized BackPropagation, Étude De Cas: Orthogonality

This paper introduces an extension of the backpropagation algorithm that enables us to have layers with constrained weights in a deep network. In particular, we make use of the Riemannian geometry and optimization techniques on matrix manifolds to step outside of normal practice in training deep networks, equipping the network with structures such as orthogonality or positive definiteness. Base...

متن کامل

Stacked Training for Overfitting Avoidance in Deep Networks

When training deep networks and other complex networks of predictors, the risk of overfitting is typically of large concern. We examine the use of stacking, a method for training multiple simultaneous predictors in order to simulate the overfitting in early layers of a network, and show how to utilize this approach for both forward training and backpropagation learning in deep networks. We then...

متن کامل

Building Deep Networks on Grassmann Manifolds

Learning representations on Grassmann manifolds is popular in quite a few visual recognition tasks. In order to enable deep learning on Grassmann manifolds, this paper proposes a deep network architecture by generalizing the Euclidean network paradigm to Grassmann manifolds. In particular, we design full rank mapping layers to transform input Grassmannian data to more desirable ones, exploit re...

متن کامل

A Riemannian Network for SPD Matrix Learnin

Symmetric Positive Definite (SPD) matrix learning methods have become popular in many image and video processing tasks, thanks to their ability to learn appropriate statistical representations while respecting Riemannian geometry of underlying SPD manifolds. In this paper we build a Riemannian network architecture to open up a new direction of SPD matrix non-linear learning in a deep model. In ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • CoRR

دوره abs/1509.07838  شماره 

صفحات  -

تاریخ انتشار 2015